
Fix DataFrame feature name warnings in sklearn wrapper models (#540) #692

Open
ugbotueferhire wants to merge 2 commits into yzhao062:master from ugbotueferhire:master


@ugbotueferhire

All Submissions Basics:

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?
  • Have you checked all Issues to tie the PR to a specific one?

All Submissions Cores:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?
  • Does your submission pass tests, including CircleCI, Travis CI, and AppVeyor?
  • Does your submission have appropriate code coverage? The cutoff threshold is 95% by Coveralls.

Closes #540

What does this PR do?

Adds X = check_array(X) to decision_function() in four sklearn-wrapper models (IForest, OCSVM, LOF, GMM) that were missing input validation. This prevents the warning "UserWarning: X has feature names, but <Model> was fitted without feature names" when scoring pandas DataFrames.

Root cause: fit() already calls check_array(X), which converts the DataFrame to a plain array and so drops its column names before the underlying sklearn estimator is fitted. decision_function(), however, passed the raw DataFrame straight to that estimator at scoring time, so sklearn saw feature names it had not been fitted with and emitted the mismatch warning.

Three other wrapper models (MCD, PCA, HDBSCAN) already had check_array(X) in their decision_function() and were unaffected.

Changes

  • pyod/models/iforest.py — add X = check_array(X) in decision_function()
  • pyod/models/ocsvm.py — same
  • pyod/models/lof.py — same
  • pyod/models/gmm.py — same
  • pyod/test/test_iforest.py — add regression test test_dataframe_no_feature_name_warning


@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d8cdccc56c

Comment thread pyod/models/lof.py Outdated
Comment thread pyod/test/test_iforest.py Outdated
@ugbotueferhire
Author

ugbotueferhire commented May 31, 2026

Hi @yzhao062, I’ve addressed the Codex review feedback on this PR and pushed the fixes.

The latest commit preserves sparse input support in LOF.decision_function and makes the pandas DataFrame regression test skip cleanly when pandas is unavailable.
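For context on the sparse-input point: sklearn's check_array rejects sparse matrices unless accept_sparse is set, so a bare check_array(X) would turn previously accepted sparse input into an error. The snippet below only demonstrates that behavior of check_array. How the commit itself handles sparse input in LOF.decision_function is not shown in this thread, so the accept_sparse="csr" choice is an assumption made for illustration.

```python
import numpy as np
from scipy import sparse
from sklearn.utils import check_array

X_sparse = sparse.csr_matrix(np.eye(4))

# Default validation requires dense data and raises TypeError on sparse input.
try:
    check_array(X_sparse)
    rejected = False
except TypeError:
    rejected = True

# Opting in keeps the matrix sparse while still validating it.
X_checked = check_array(X_sparse, accept_sparse="csr")
still_sparse = sparse.issparse(X_checked)

print(rejected, still_sparse)
```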

Could you please take another look and merge when you have a chance? Thank you!


Development

Successfully merging this pull request may close these issues.

False positive warning when manipulating pandas dataframes